EN FR
EN FR


Section: New Results

Modeling XML document transformations

Participants : Joachim Niehren, Sophie Tison, Sławek Staworko, Aurélien Lemay, Anne-Cécile Caron, Yves Roos, Shunichi Amano, Camille Vacher, Benoît Groz, Antoine Ndione, Tom Sebastian.

Query answering on XML streams. In [16] , Gauwin and Niehren introduce the notion of finite streamability for query languages, and classify fragments of XPath that are finitely streamable or not. They show that if a query language is finitely streamable, then its satisfiability problem can be solved in polynomial time, which in turn is known to fail for mostly all fragments of XPath . They also show that FXP , the fragment of ForwardXPath with child and descendant axis, conjunction, and negation becomes finitely streamable if bounding the number of conjunctions. Since 3 conjunctions are enough in many practical applications, FXP is most relevant in practice. Without any bound, FXP is not finitely streamable, since its satisfiablity problem is DEXPTIME hard. The positive result for FXP with a bounded number of conjunctions is obtained by compilation of FXP to deterministic nested word automata. The compiler is in exponential time in the number of conjunctions, and thus polynomial if this parameter is bounded.

Answer enumeration for n-ary queries. Bagan, Filiot, Gauwin, and Niehren investigated answer enumeration algorithms for dialects of XPath with variables. The problem with n-ary queries is that answer sets may grow exponentially in |t| n , so that algorithms depending polynomially on the size of the answer set might still be unfeasible. In such case, it might still be possible to enumerate elements of answer sets on need. The questions is then whether enumeration can be done efficiently without duplicates and failures, that is with constant delay between subsequent answers and polynomial time preprocessing in the size of the query and the tree. We obtained positive results on answer enumeration with constant delay enumeration for acyclic conjunctive queries over so called X-doublebar structures that we introduce [24] . These subsume tree structures with child, next-sibling and next-sibling * axis, but not the descendant axis. Our result can be lifted to a dialect of ConditionalXPath with variables, that is FO -complete on trees of bounded depth, so that the descendant axis is not needed.

Tree automata global constraints. TAGEDs are a new class of tree automata with constraints that currently receive much interest from top conferences on theoretical computer science. During its postdoc in Lille, Vacher improved complexity bounds for some fragments of decidability results [12] .

Sequential tree-to-word transducers. Laurence, Lemay, Niehren, Staworko, Tommasi considered deterministic sequential top-down tree-to-word transducers (STWs ), that capture the class of deterministic top-down nested-word to word transducers. While reordering and copying are not allowed, STWs are nevertheless very expressive because they allow concatenation of outputs, deletion of inner nodes and they can produce context free languages as output. Their expressiveness is incomparable with DTOPs (plus concatenation, but minus copying). While objecting for learning algorithms, they study normalization of STWs in a first step and then develop unique minimalization algorithms for normalized STWs in a second step in [19] . The idea of normalization is to produce the output in an earliest manner, when reading the input in document order. This works only on binary trees, but can be lifted to unranked trees modulo the binary top-down encoding. The normalization algorithm is by far nontrivial. The natural continuation of this approach will be toward learning algorithms for earliest STWs .

Access control for XML views. The PhD project of Groz, supervised by Staworko and Tison, is centered on access control for XML databases, and in particular on security of user views over XML documents. He obtained results on query rewriting for read-only queries, and translation for update queries. More precisely, given an XML view definition and a user defined query (resp. update program) q, the problem is to find a source query (resp. update program) that is equivalent to q on the view. Caron, Groz, Roos, Staworko and Tison study update programs and views represented by recognizable tree languages in [15] , and devise algorithms for update translation in different settings, namely without or with constraints on the authorised source updates.